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SIMULTANEOUS ANALYSIS OF A SEQUENCE OF PAIRED 
ECOLOGICAL TABLES: A COMPARISON OF SEVERAL 

METHODS 

By Jean Thioulouse 

University of Lyon and CNRS 

A pair of ecological tables is made of one table containing environ- 
mental variables (in columns) and another table containing species 
data (in columns) . The rows of these two tables are identical and cor- 
respond to the sites where environmental variables and species data 
have been measured. Such data are used to analyze the relationships 
between species and their environment. If sampling is repeated over 
time for both tables, one obtains a sequence of pairs of ecological 
tables. Analyzing this type of data is a way to assess changes in 
species-environment relationships, which can be important for con- 
servation Ecology or for global change studies. We present a new data 
analysis method adapted to the study of this type of data, and we 
compare it with two other methods on the same data set. All three 
methods are implemented in the ade4 package for the R environment. 

1. Introduction. Ecological data analysis has been very productive in 
the second part of the 20th century. Many original multivariate data analy- 
sis methods have been developed, particularly those designed to tackle the 
fundamental issue of Ecology: the description of the relationships between 
species and their environment. 

These methods study the relationships between species and their environ- 
ment through two data tables, called a "pair of ecological tables." The first 
of these two tables contains environmental variables (in columns) recorded 
in a set of sampling sites (rows). These variables are usually quantitative, 
physico-chemical properties, for example, and they can also be categorical. 

The second table of the pair is the species table, containing species data 
recorded at the same sampling sites. This can be the number of organisms, 
their presence/absence, or an abundance level. The rows in this table cor- 
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respond to the sampling sites, and its columns correspond to the species. 
Doledec and Chessel (1994) present a short review of linear ordination meth- 
ods for studying species-environment relationships, and the paper by Dray, 
Chessel and Thioulouse (2003) features a comparison of the advantages and 
disadvantages of recent methods. 

Ecologists are also interested in the changes in the relationships between 
species and environment [Guisan and Zimmermann (2000)]. Indeed, varia- 
tions of these relationships can be important, for example, from the point 
of view of species conservation or for global change studies. When sampling 
is repeated in time (or in space), one gets a sequence of tables, also called 
a fc-table. When a pair of ecological tables is repeated, the result is a pair of 
fc-tables, or two data cubes. One sequence of tables makes one data cube, and 
a sequence of pairs of tables makes a pair of data cubes (a species data cube 
plus an environmental data cube). Analyzing the relationships between the 
two cubes can give useful insights into the evolution of species-environment 
relationships. 

From the point of view of statistical methods, two approaches can be 
contrasted: the descriptive strategy and the predictive strategy. The aim of 
the first one is an objective description of the data set and of the relation- 
ships between its components. The second approach is orientated toward the 
prediction of "explained" (or "dependent") variables by "explanatory" (or 
"independent") ones. This distinction implies an asymmetry of predictive 
methods and a symmetry of descriptive ones. Indeed, descriptive methods 
do not differentiate between "explained" and "explanatory" variables. 

This difference has consequences on computational constraints: predic- 
tive methods have a matrix inversion step that is not present in descriptive 
methods. This matrix inversion step has negative consequences on the data 
sets that can be analyzed: it means that "explanatory" variables must be 
independent (in the statistical sense), that is, that they must be linearly 
independent, because the rank of their correlation matrix must not be less 
than its dimension. This also implies that the number of cases (samples) 
must not be less than the number of explanatory variables. 

This constraint is particularly important since the advent of bioinformat- 
ics, with the huge data sets provided by high throughput molecular biol- 
ogy methods like DNA microarrays and DNA fingerprints. These methods 
can produce extremely large data tables with very low information density. 
There are potentially thousands of variables, corresponding, for example, to 
electrophoresis bands or to DNA sequence tags. 

Conversely, descriptive methods can be used without constraint on the 
ratio between the number of samples and the number of variables. The main 
body of this paper is restricted to this approach. Figure 1 shows a diagram 
describing the methods that we are going to present, with the corresponding 
data structures. Abbreviations are given in the rest of this introduction. 
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FlG. 1. Diagram describing data structures in various experimental conditions: one table, 
one table with groups of rows, two tables, one k-table, two tables with groups of rows, and 
two k-tables. Data analysis methods corresponding to these situations are given on the right 
of each data structure. Abbreviations are given in the text (see Introduction). 



In the context of ecological data analysis, the distinction between predic- 
tive and descriptive methods is particularly important in two cases: when 
samples (the rows of data tables) belong to groups, and in the case of a pair 
of ecological tables. 

When groups of samples are involved, Linear Discriminant Analysis (LDA) 
[Venables and Ripley (2002)] is a classical example of a predictive approach. 
Between-Group Analysis (BGA) [Doledec and Chessel (1987); Culhane et al. 
(2002)] is a descriptive method analogous to LDA, but it can be used even 
when the number of samples is less than the number of variables. It can be 
seen simply as the PCA of the table of group means. Within-Group Analysis 
(WGA) [Doledec and Chessel (1987)] is the reverse of Between-Group Anal- 
ysis: it is the PCA of the residuals between initial data and group means. 
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It removes the effect of the grouping variable and analyzes the remaining 
variability. 

The standard for the analysis of one pair of ecological tables is Canoni- 
cal Correspondence Analysis (CCA) [ter Braak (1986), not to be confused 
with Canonical Correlation Analysis (CANCOR)]. Canonical Correspon- 
dence Analysis can be seen as a particular form of Correspondence Analysis 
(CA) [Benzecri and Coll (1973); Hill (1973)] or of Multiple Correspondence 
Analysis (MCA) [Tenenhaus and Young (1985)], where sample scores are 
constrained to be a linear combination of environmental variables. It be- 
longs to the predictive approach, and involves a regression step (including 
a matrix inversion). It is therefore restricted to the case where explanatory 
variables (usually environmental parameters) are linearly independent and 
not too many. Redundancy Analysis (RDA) [Legendre and Legendre (1998)] 
is similar to Canonical Correspondence Analysis, but it is a constrained PCA 
instead of a constrained Correspondence Analysis. 

On the other hand, Co-Inertia Analysis (COIA) [Doledec and Chessel 
(1994); Dray, Chessel and Thioulouse (2003)] belongs to the descriptive ap- 
proach. It a simple and robust alternative to Canonical Correspondence 
Analysis when the number of samples is low compared to the number of ex- 
planatory (environmental) variables. Co-Inertia Analysis can be seen as the 
PCA of the table of cross-covariances between the variables of the two tables. 
Other advantages are detailed in Dray, Chessel and Thioulouse (2003). 

A'-table analysis methods are used to analyze series of tables. They belong 
to three families: STATIS [Lavit et al. (1994); Escoufier (1973)], Multiple 
Factor Analysis (MFA) [Escofier and Pages (1994)], and Multiple Co-Inertia 
Analysis (MCOIA) [Chessel and Hanafi (1996)]. Partial Triadic Analysis 
(PTA) [Thioulouse and Chessel (1987)] is one of the simplest analyses of the 
STATIS family, and it can be seen as the PCA of a series of PCAs. 

Multivariate analysis methods for pairs of data cubes are not widespread. 
Two of them are based on co-inertia: the first one is called Between-Group 
Co-Inertia Analysis (BGCOIA) [Franquet, Doledec and Chessel (1995)], and 
the second one is the STATICO method [Simier et al. (1999); Thioulouse, 
Simier and Chessel (2004)]. In this paper we present a new method called 
COSTATIS, and we compare it with BGCOIA and STATICO. The name 
STATICO means "STATIS and CO-inertia," while COSTATIS means "CO- 
inertia and STATIS." STATIS is a French abbreviation for "Structuration 
des TAbleaux a Trois Indices de la Statistique" (organization of three way 
tables in Statistics). 

The comparison of the three methods is done on the same data set as the 
one used by Franquet, Doledec and Chessel (1995) and by Thioulouse, Simier 
and Chessel (2004). BGCOIA and STATICO have already been presented 
in biological journals, so we briefly present their methodological bases. The 
COSTATIS method has never been presented before, so we explain it here. 
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Fig. 2. The example data set consists of two data cubes. The first one contains 10 envi- 
ronmental variables that have been measured four times (in Spring, Summer, Autumn and 
Winter) along six sampling sites. The second one contains the numbers of Ephemeroptera 
belonging to 13 species, collected in the same conditions. 



We compare the results of the three methods from a rather practical point 
of view, on their respective graphical outputs for the same data set, and on 
their global properties. 

Functions for the R software [R Development Core Team (2009)] to per- 
form computations and graphical displays for the three methods are available 
in the ade4 package [http://pbil.univ-lyonl.fr/ade4/ see Chessel, Du- 
four and Thioulouse (2004); Dray, Dufour and Chessel (2007)]. All the com- 
putations and graphical displays can be redone interactively online, thanks 
to this reproducibility page: http://pbil.univ-lyonl.fr/SAOASDPET/. 

2. Example data set and basic analyses. The three data cube coupling 
methods presented here are based on three different basic analyses: Between- 
Group Analysis, Co-Inertia Analysis and Partial Triadic Analysis. 1 In this 
section we give a short summary of these basic analyses in the framework 
of the duality diagram. We start the section by presenting a description of 
the example data set. 

2.1. Example data set. We present the results of the three data cube 
coupling methods using graphical displays obtained on the same data set. 
This data set is the one used by Franquet, Doledec and Chessel (1995) and by 
Thioulouse, Simier and Chessel (2004). Numerical data are printed in both 
papers, and they are also available in the R package "ade4," in the "meau" 
data set. A picture of these data as two data cubes is given in Figure 2. 
They are arranged in two tables: one table with 24 rows and 10 columns, 
containing the environmental variables, and one table with 24 rows and 13 
columns, containing the species data. 

The rows of both tables correspond to 6 sampling sites ordered upstream- 
downstream along a small stream, the Meaudret (South-East of France, 



1 BGCOIA is based on Between-Group Analysis plus Co-Inertia Analysis, STATICO is 
based on Co-Inertia Analysis plus Partial Triadic Analysis and COSTATIS is based on 
Partial Triadic Analysis plus Co-Inertia Analysis. 
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in the Vercors massif). Site 6 is a control, located on another stream, the 
Bourne, into which the Meaudret flows. These 6 sites are sampled 4 times, 
in Spring, Summer, Autumn and Winter. The 10 environmental variables 
are physico-chemical measures: water temperature, flow, pH, conductivity, 
oxygen, biological oxygen demand (BOD5), oxidability, ammonium, nitrates 
and phosphates. Most of these variables are related to water pollution. In- 
deed, there is a large summer mountain resort (Autrans) located between 
sites 1 and 2, and its influence is predominant. 

The 13 columns of the species data table correspond to 13 Ephemeroptera 
species (mayflies), which are known to be highly sensitive to water pollution. 
These species are as follows: 

Eda = Ephemera danica, Bsp = Baetis sp., Brh = Baetis rhodani, Bni = 
Baetis niger, Bpu = Baetis pumilus, Cen = Centroptilum, Ecd = Ecdyonu- 
rus, Rhi = Rhithrogena, Hla = Habrophlebia lauta, Hab = Habroleptoides 
modesta, Par = Paraleptophlebia, Cae = Caenis, Eig = Ephemerella ignita. 

The goal of the analysis of this data set is to enlighten the relationships 
between Ephemeroptera species distribution and the quality of water. More 
precisely, data analysis methods should help discover how these relationships 
vary in the space-time experimental setup (i.e., according to seasons during 
one year and along the stream). 

2.2. PCA duality diagram. The duality diagram [Escoufier (1987); Hol- 
mes (2006)] will be used in this paper to present several methods, so we 
explain it here first for a simple Principal Component Analysis (PCA). Let 
X = [£ij]( n , jP ) be the data table (environmental variables table, for exam- 
ple), with n rows (sampling sites) and p columns (variables). X T is the 
transpose of X. Let D n be the diagonal matrix (n x n) of site weights: 
D n = diag(iwi, . . . , w n ), and let D p be a metric on MP. The duality diagram 
of the general analysis of the data table is as follows: 



D, 



D„ 



This is called a "duality diagram" because MP and M n are the dual spaces 
of MP and M n , and because the dual operators X r D n XD p and XD p X T D„ 
share the same spectrum. This diagram is completely defined by the "triplet 
notation": (X,D p ,D n ), and the total inertia of this statistical triplet is 

I x = trace(XD p X T D n ). 
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The generalized PCA (gPCA) of this triplet corresponds to the spec- 
tral decomposition of X r D„XD p . When D n is the matrix of uniform row 
weights (wi = 1/n), and D p is the identity (Euclidean metric), then this 
analysis is a simple PCA, and if the variables are centered, the inertia is the 
sum of their variances. 

This duality diagram can be seen as a picture of the underlying math- 
ematical objects used in the theoretical description of the analysis. It has 
several functions, like making it easier to remember the characteristics of 
particular methods (mnemonic), finding out the matrices that are needed to 
perform computations (for example, doing one way around the diagram gives 
the matrix from which eigenvalues and eigenvectors should be extracted), 
or where particular objects (like row scores and variable loadings) can be 
found and how to compute them. Dray and Dufour (2007) give a detailed 
description of the use of the duality diagram in multivariate ecological data 
analysis and in the ade4 package for the R environment. 

2.3. Between-Group Analysis. Between-Group Analyses [Doledec and 
Chessel (1987); Culhane et al. (2002)] are a particular class of analyses, 
similar in their aim to linear discriminant analysis, but comprising no ma- 
trix inversion step. They consist in the summary (for example, by a PCA) of 
the table of group means. In a second step, the rows of the initial table are 
projected in this PCA to get row scores for all observations. The advantage 
is that there is no constraint on the number of observations compared to 
the number of variables, and no problem with numerous and/or correlated 
variables, as it is the case in LDA. There are several types of between-group 
analyses, corresponding to the initial analysis after which Between-Group 
Analysis is computed. This can be, for example, a PCA, a Correspondence 
Analysis, or a Multiple Correspondence Analysis. 

In Between-Group Analysis, samples belong to g groups, namely, G±, . . . , 
G g , with group counts n\, . . . , n g , and Yl n k = n. Between-Group Analysis is 
the analysis of triplet (X#,D p ,D 9 ), where X# is the (g,p) matrix of group 
means: 

X B = [Xj](g,p)- 

The term Xj = ^- X]jeG fe x v 1S ^ e mean °f variable j in group k. In matrix 
notation, if B is the matrix of class indicators, B = with = 1 if 

i € Gk and b\ = if i £ Gk , then we have 

X B = D 3 B r X. 

Matrix Y) g = Diag(— ) is the diagonal matrix of (possibly nonuniform) 
group weights, and B T is the transpose of B. The corresponding duality 
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diagram is the following: 



D, 



D 9 

Between-Group Analysis is therefore the analysis of the table of group 
means, leading to the diagonalization of matrix X^D 3 X#D p . It's aim is 
to highlight the differences between groups, and row scores maximize the 
between-group variance. The statistical significance of these differences can 
be tested by a permutation test, with a criterion equal to the between/total 
variance ratio. Row scores of the initial data table can be computed by 
projecting the rows of table X on the principal component subspaces. 

2.4. Co-Inertia Analysis. The first presentation of Co-Inertia Analysis 
dates back to Doledec and Chessel (1994), but almost ten years later, Dray, 
Chessel and Thioulouse (2003) gave a more detailed presentation and com- 
pared it with Canonical Correspondence Analysis. Just as inertia is a sum of 
variances, co-inertia is a sum of squared covariances, and Co-Inertia Anal- 
ysis describes the co-structure between two ecological data tables by sum- 
marizing as well as possible the squared covariances between species and 
environment. Here is a short description of this analysis. 

Let X be the first table (environment variables table), with n rows (sam- 
pling sites) and p columns (variables), and let Y be the second table (species 
data), with the same n rows and q columns (species). X T and Y T are the 
transpose of X and Y. Let D n be the diagonal matrix (n x n) of site weights: 
D n = diag(wi, . . . ,w n ), and let D p and D g be two metrics on W and W re- 
spectively. Before doing the Co-Inertia Analysis, we need to analyze each 
table separately. The duality diagrams of the separate analyses of the two 
data tables are as follows: 



D„ 



D,- 



A generalized PCA of these triplets corresponds to the spectral decom- 
position of X T D n XD p and Y T D n YD„. When D n is the matrix of uniform 
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row weights (wi = 1/n), and D p and D q are identity (Euclidean metrics), 
then these analyses are simple PCA. 

Co-Inertia Analysis is defined by the duality diagram obtained by merging 
these two separate diagrams. This will be possible when they have the same 
spaces M. n and R n in common, which implies that the rows of the two tables 
must be identical. The "coupled diagram" of Co-Inertia Analysis is therefore 




Co- Inertia Analysis is the eigenanalysis of matrix X D„YD ? Y D n XD p 
(starting in W). This is equivalent to the following "crossed diagram": 



D, 



X 1 D„Y 



Y'D„X. 







R q 







D„ 



This diagram highlights the fact that Co-Inertia Analysis is the analy- 
sis of a cross product table, and its triplet notation is (Y T D n X, D p , D g ). If 
the columns of both tables are centered, then the total inertia of each table is 
simply a sum of variances: Jx = trace(XDpX T D n ) and Iy = trace (YD g Y T D n ) 
And the co-inertia between X and Y is in this case a sum of squared co- 
variances: 

Co/xy = trace(XD p X T D n YD (? Y T D n ). 

Co-Inertia Analysis maximizes the covariance between the row scores of 
the two tables [Dray, Chessel and Thioulouse (2003)]. Co-inertia is high 
when the values in both tables are high simultaneously (or when they vary 
inversely) and low when they vary independently or when they do not vary. 
This is interesting from an ecological point of view: Co-Inertia Analysis 
will show species that are abundant when some environmental variables are 
particularly high (or low), and it will discard species that are not influenced 
by these environmental variable. This is the meaning of the co-structure 
between the two data tables. 



10 J. THIOULOUSE 

The above "coupled diagram" shows the similarity of Co-Inertia Analysis 
with Canonical Correlation Analysis (CANCOR). Indeed, the only difference 
between the Co-Inertia Analysis and Canonical Correlation Analysis duality 
diagram [Cailliez and Pages (1976), p. 352; Holmes (2006)] comes from the 
metrics on W and W. q : 




Canonical Correlation Analysis uses the Mahalanobis metric on MP and R q , 
whose matrices are the inverse of the covariance matrices Vu = X r D n X and 
V22 = Y T D n Y. This leads to the Canonical Correlation Analysis triplet: 
(Y T D n X,(X T D n X)- 1 ,(Y r D n Y)" 1 ). Canonical Correlation Analysis row 
scores maximize their correlation, but this can be achieved with very small 
variances. By maximizing the covariance instead of the correlation, Co- 
Inertia Analysis ensures that the scores do not have very small variances, 
and therefore have a good percentage of explained variance in each space. 

This diagram also clarifies the link between Co-Inertia Analysis and in- 
strumental variable methods like Principal Component Analysis with respect 
to Instrumental Variables (PCAIV) [Rao (1964)], and particularly Canonical 
Correspondence Analysis and Redundancy Analysis, which are primordial 
in ecological data analysis: 



V" 1 

v 11 



D,; 



D„ 



In instrumental variables methods, the Mahalanobis metric is used in only 
one of the two spaces, most often the environmental variables space (BP). 
This corresponds to the situation where one wants to explain species distri- 
bution by linear combinations of environmental variables, and this leads to 
the usual PCAIV triplet: (Y T D n X, (X T D n X)-\ D g ). 

2.5. Partial triadic analysis. Partial Triadic Analysis [Thioulouse and 
Chessel (1987)] belongs to the STATIS family of the fc-table analysis meth- 
ods [see, for example, the special issue of the journal Comput. Statist. Data 
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Anal. 18(1) (1994), or Stanimirova et al. (2004)]. The STATIS family can 
be thought of as providing a PCA of a set of PCA's. In ordinary PCA, the 
data table is summarized by a vector (principal component), and in STATIS 
methods, the /c-table is summarized by a matrix. Partial Triadic Analysis 
is the most simple of these methods, but it is also the most restrictive one. 
Its aim is to analyze a series of k tables having the same rows and the same 
columns. This means that the same variables must be measured at the same 
sampling sites, several times. Partial Triadic Analysis, like any STATIS-like 
method, follows three steps: interstructure, compromise and infrastructure 
(also called "trajectories"). 

The interstructure step provides the coefficients of a special linear combi- 
nation of the data tables, leading to an optimal summary called the "com- 
promise." The second step computes the PCA of this linear combination. 
The infrastructure step is a projection of the rows and columns of each table 
of the series into the multidimensional space of the compromise analysis. 

The interstructure is based on the concepts of "vector variance" and "vec- 
tor covariance" [Escoufier (1973)]. It constructs a matrix of scalar products 
between tables (the vector covariance matrix) that can be written simply 
Covv(X i i c , X;) = Trace(X^D n X;D p ), where X& is the kth table from the 
series. The eigenanalysis of this vector covariance matrix gives a first eigen- 
vector, and the components of this first eigenvector are used as weights 
to compute the compromise. 

Alternatively, a "vector correlation" matrix can be used, that rescales the 
tables: Rv(Xfc,Xj) = Covv(Xfc, Xj)/ -y/Varv(Xfc) Varv(X/). Varv(Xfc) is the 
vector variance of table k: Varv(Xfc) = Trace(X^ D n XfcDp). It is simply the 
usual variance of the vector obtained by putting all the columns of table X^ 
one below the other. 

The compromise X c is a linear combination of the initial tables, weighted 
by the components of the first eigenvector of the interstructure: X c = 
^ fc afcXfc. The inertia of this compromise is maximized, and its main prop- 
erty is that it maximizes the similarity with all the initial tables, as measured 
by the sum of their squared dot product, ^2 k (X. c , X^} 2 . When the tables are 
normed, the dot product is the Rv coefficient. 

The weight of each table is proportional to its inertia, so tables that 
are different from the others will be downweighted. This property ensures 
that the compromise will resemble all the tables of the sequence "as closely 
as possible" in a least square sense. The analysis of this compromise, for 
example, by a PCA, gives two-dimensional representations (principal axes 
maps) that can be used to interpret its structure. 

The infrastructure is obtained by projecting the rows and columns of 
each table of the series in the analysis of the compromise (i.e., the rows 
are projected on the principal axes, and the columns are projected on the 
principal components). This step is done in the same way as the projection of 
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supplementary elements in a simple PCA [see, for example, Lebart, Morineau 
and Warwick (1984), page 14]. Let U be the matrix of the eigenvectors of 
the analysis of the compromise. The scores of the rows of table are = 
XfcDpU, and the coordinates of its columns are = X^D n X c D p UA -1 / 2 , 
where A -1 / 2 is the diagonal matrix of the inverse of the square root of the 
eigenvalues of the analysis of the compromise. 

The advantage of Partial Triadic Analysis is that it highlights the "stable 
structure" in a sequence of tables. The compromise step displays this stable 
structure (when it exists) , and the infrastructure step shows how each table 
moves away from it. 

3. Data cube coupling methods. In this section we present three meth- 
ods for analyzing a pair of data cubes: BGCOIA, STATICO and COSTATIS. 
The principle of each method is briefly explained, and the result obtained 
on the example data set is detailed. 

Figure 3 shows a comparison of the three approaches. BGCOIA is a bet- 
ween-group co-inertia analysis. It is therefore simply computed by doing 
a Co-Inertia Analysis on the two tables of group means, considering each 
table as a group [Franquet, Doledec and Chessel (1995)]. In STATICO, we 
first use Co-Inertia Analysis k times to compute the sequence of k cross- 
covariance tables, and then Partial Triadic Analysis to analyze this new Ac- 
table. In COSTATIS, we first use two Partial Triadic Analyses to compute 
the compromises of the two fc-tables, and then Co- Inertia Analysis to analyze 
the relationships between these two compromises. 

3.1. BGCOIA. Let g be the number of groups. The table of group means 
for environmental variables is obtained by computing the means of each vari- 
able within each group. This gives a new table, with g rows and p columns. 
The same computations are done for the species data table, leading to a sec- 
ond new table with g rows and q columns. A simple Co- Inertia Analysis is 
performed on these two new tables. The rows of the initial tables can be 
projected into this analysis to help interpret the results [Lebart, Morineau 
and Warwick (1984)]. 

The main advantage of this method is its simplicity, from both theoretical 
and practical points of view. The two data cubes are reduced to two tables 
by taking the means of the columns of each elementary table of the cubes. 
Co-Inertia Analysis is then applied to the two resulting tables. 

The fact that it is a Between-Group Analysis can be used to give more 
importance to the spatial or the temporal effect in a space-time experimental 
design. Tables can correspond to dates or to sampling sites and, depending 
on the importance that should be given to space or time processes, one 
or the other of the two possibilities can be used. On the example data set 
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Fig. 3. Comparison of the three setups. BGCOIA is a between- group co-inertia analysis, 
considering each table (date or site) of the sequence as a group. STATICO is a Partial 
Triadic Analysis on the series of cross product tables obtained by crossing the two tables 
of a pair at each date. COSTATIS is a co-inertia analysis of the compromises computed 
by the Partial Triadic Analysis of the two k-tables. In BGCOIA, the mean of the variables 
in each table is computed and arranged in two new tables. A Co-Inertia Analysis is then 
done on this couple of new tables. In STATICO, k cross-covariance tables are computed 
from the two k-tables, resulting in a new k-table. A Partial Triadic Analysis is then done 
on this new k-table. In COSTATIS, two Partial Triadic Analyses are used to compute 
the compromises of the two k-tables. A Co-Inertia Analysis is then used to analyze the 
relationships between these two compromises. 



used here, Pranquet, Doledec and Chessel (1995) considered that one table 
corresponds to one sampling site, so we use the same setup. 

Note that the symmetric method, Within-Group Co-Inertia Analysis (WG- 
COIA) based on Co-Inertia Analysis and Within-Group Analysis [Pranquet 
and Chessel (1994)] can also be used to analyze one effect after removing 
the other, which results in four possible setups and four different analyses 
(between dates, between sites, within dates and within sites). 
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Figure 4 shows the results of the between-group co- inertia analysis; it cor- 
responds to Figure 4 of Franquet, Doledec and Chessel (1995). Figure 4A is 
the principal axes map of the rows of the cross product table (Ephemeroptera 
species) , Figure 4B is the principal axes map of the columns (environmental 
variables), and Figure 4C is the principal axes map of the sites. Figures 4A 
and 4B are obtained directly with the row scores and column loadings of 
the cross product table, while Figure 4C is obtained by projecting the rows 
of the two sequences of tables as supplementary elements into the co-inertia 
analysis space. 

The 48 points on Figure 4C correspond to the 6 sites, sampled 4 times, and 
there is one set of points for the environmental variables table sequence (open 
circles) and one set of points for the Ephemeroptera species table sequence 
(black circles) . The columns of these two sequences of tables (environmental 
variables at each date and Ephemeroptera species at each date) could also 
be projected into the co-inertia analysis, but this has not been done here. 

The four points corresponding to the four sampling dates for each site are 
grouped to form a star, and the barycenter of these four points is labeled 
with the number of the site (white background label for the environmen- 
tal variables table sequence, gray background label for the Ephemeroptera 
species table sequence). 

The interpretation of Figure 4C is simple. The first axis (Figure 4B, hor- 
izontal) is a pollution gradient from left (unpolluted situation: high con- 
centration of oxygen and high pH) to right (highly polluted situation: high 
concentrations of amonium and phosphates, high conductivity and oxydabil- 
ity, high biological oxygen demand). The second axis (Figure 4B, vertical) 
is an upstream-downstream physical gradient: discharge ("Flow") increases 
downstream (upward on the figure). Most Ephemeroptera species are more 
abundant in unpolluted situations (Figure 4A, horizontal), and some species 
are characteristic of the lower part of the stream (Bsp, Eig, Ecd), while others 
(Bpu, Hla, Eda) are characteristic of the upper part, or of site 6 (Figure 4A, 
vertical) . 

Figure 4C shows that the spatial component of the phenomenon is more 
important than the temporal aspect. The first axis (Figure 4C, horizontal) 
opposes unpolluted sites (site 1, upstream the Autrans summer mountain 
resort, and site 6) to highly polluted sites (site 2, just downstream Autrans, 
and receiving the ouputs of the sewer system). Sites 3, 4 and 5 are less 
and less polluted. This corresponds to the natural restoration process: the 
pollution is gradually resorbed along the stream. Figure 4C also shows that 
the biological processes are very linked to the physico-chemical variations of 
water quality along the stream: the two ordinations of sites are very similar. 

3.2. STATICO. The STATICO method is based on the Partial Triadic 
Analysis of a sequence of cross product tables (see Figure 3). Starting from 
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Fig. 4. First two principal axes maps of the between- group co-inertia analysis (see text 
for details). The eigenvalues corresponding to these two axes are equal to 70.2 and 4-45- 
The scale is given by the value (d) in the upper right or lower left corner of each plot; it 
corresponds to the size of the background grid. (A) Map of the rows of the cross product 
table (Ephemeroptera species). (B) Map of the columns (environmental variables). (C) Map 
of the sites. The 48 points in (C) correspond to the 6 sites (labeled 1-6), sampled 4 times, 
with one set of points for the environmental variables table sequence ( open circles, white 
background site labels) and one set of points for the Ephemeroptera species table sequence 
(black circles, gray background site labels). 
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the sequence of paired ecological tables, each cross product table is com- 
puted using the pair of tables at each date. All the tables of the sequence do 
not need to have the same number of rows, but they need to have the same 
number of columns across dates. This means that the number of sampling 
sites can vary from one date to another, but the number of environmen- 
tal variables (p) must be the same for all the dates, and the number of 
species (q) must also be the same for all dates. Therefore, all the cross prod- 
uct tables have the same number of rows (p) and columns (q). They contain 
the covariances between the columns of the two tables. 

Let (X;,,, Dp, D nfe ) and (Y^, D g , D nfe ) be the pair of triplets at date k. 
is the table of environmental variables measured at date k, and Y& is the 
table of species observed at the same date. D p and D g are the same for all 
the dates and D nfe = Diag(— ) is the same for both tables. Let be the 

kth. cross product table: = Y^D nfc Xfc. The Co-Inertia Analysis triplet 
at date k is (Zfc,D p ,D g ) and the STATICO method is the Partial Triadic 
Analysis of the /c-table made by this series of cross product tables. 

The interstructure step gives optimal weights such that the inertia of 
the triplet a^Z^, D p , D g ) is maximum with the constraint X^fc a fc = 1- 

The compromise of the STATICO method (Z) is a weighted mean of the 
cross product tables using weights a^: Z = X^ctfcZfc [Simier et al. (1999)]. 
The analysis (PCA) of this compromise gives a graphical display of the 
environmental variables (rows of Z) and of the species (columns of Z). 

Finally, the intrastructure step projects the rows and columns of each 
table of the sequence in the analysis of the compromise, with usual sup- 
plementary element projection technique [Lebart, Morineau and Warwick 
(1984)]. This gives a display of the environmental variables at each date, of 
the species at each date, and two displays of the sampling sites at each date 
(one from the point of view of environmental variables and one from the 
point of view of species). 

The results of the STATICO method are presented in Figures 5-7. Fig- 
ure 5 is a compound graph that sums up the first two steps of the STATICO 
method: interstructure and compromise. Figure 6 shows the intrastructure 
step for environmental variables and Ephemeroptera species (i.e., the pro- 
jection of the columns of the tables of the two sequences as supplementary 
elements in the compromise analysis), and Figure 7 is the intrastructure 
step for the sites (i.e., the projection of the rows of the tables of the two 
sequences). 

The STATICO method is a Partial Triadic Analysis on the sequence of 
cross product tables, so the compromise is also a cross product table, with 
the 13 Ephemeroptera species in rows and the 10 environmental variables in 
columns. Sites have disappeared from this table, but they can be projected 
as supplementary elements to help interpret the results of the analysis. 
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Fig. 5. Results of the STATICO method. This is a compound graph, automatically drawn 
by the "plot" function of the ade4 package for a partial triadic analysis. The eigenvalues 
corresponding to the two axes of the compromise analysis are equal to 593.6 and 45.3. 
The scale is given by the value (d) in the upper right corner of the compromise plots; it 
corresponds to the size of the background grid. The four plots are as follows: (A) The 
interstructure plot, showing the four seasons, and the importance of the corresponding 
tables in the definition of the compromise (coordinate of the points on the first axis). 
(B) Compromise analysis principal axes map (environmental variables). (C) Compromise 
analysis principal axes map (Ephemeroptera species). (D) Typological values of the four 
tables (square cosines vs. table weights). 



The interstructure plot (Figure 5A) shows that Autumn and Summer are 
the two most important seasons for defining the compromise, while Winter 
and Spring are slightly less important. 

The compromise plots (Figure 5B and C) are very similar to the BGCOIA 
plots (Figure 5A and B). They show that the first axis (horizontal) is also 
a pollution gradient: clean water on the right, and pollution on the left. The 
second axis (vertical) is also an upstream-downstream physical gradient: 
discharge ( "Flow" ) and temperature ( "Temp" ) increase downstream (down- 
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Fig. 6. Results of the STATICO method: variable infrastructure step. The scale is given 
by the value (d) in the upper right corner of each plot; it corresponds to the size of the 
background grid. The four plots on the left (A) show the environmental variables at each 
date (Spring, Summer, Autumn, Winter), and the four plots on the right (B) show the 
Ephemeroptera species at the same dates. 

ward on the figure). Nitrates ("Nitr") also increase along the whole stream 
instead of having a maximum at site 2 like other pollution variables, and this 
is why they are located here. The sensitivity of all Ephemeroptera species to 
pollution and the specificity of some species (Bpu, Hla, Eda upstream and 
Bsp, Eig, Ecd downstream) are also found again. 
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Fig. 7. Results of the STATICO method: site infrastructure step ("trajectories"). The 
scale is given by the value (d) in the upper right corner of each plot; it corresponds to the 
size of the background grid. (A) trajectories of sites for environmental variables. (B) tra- 
jectories of sites for Ephemeroptera species. These plots show the distortions of the up- 
stream- downstream gradients across seasons, as Ephemeroptera species react to pollution 
increase (maximum reached in Autumn) or decrease (minimum in Spring). 

The "typological value" plot (Figure 5D) shows that Autumn has the 
highest influence in the construction of the compromise, while Spring has 
the lowest. 

Figure 6 shows the intrastructure step for environmental variables (Fig- 
ure 6A) and Ephemeroptera species (Figure 6B). It is drawn using the pro- 
jection of the columns of the two sequences of tables as supplementary ele- 
ments in the compromise analysis. 

Autumn is clearly the season where the structures are the strongest (ar- 
rows are much longer at this date) , both for environmental variables and for 
Ephemeroptera species. Conversely, Spring is the season where the structures 
are the weakest (arrows are all very short). This confirms the interpretations 
made on Figure 5. However, although the structures may vary in intensity, 
they are preserved across dates: the first axis is always a pollution gradient, 
and the second one is always an upstream-downstream opposition. 

Figure 7 shows the intrastructure step for the sites. It is drawn using the 
projection of the rows of the two sequences of tables as supplementary ele- 
ments in the compromise analysis. It is very similar to Figure 4C, but it is 
split according to seasons instead of sites. The environmental variable plot 
(Figure 5A) and the Ephemeroptera species plot (Figure 7B) are placed side 
by side. This presentation insists on the comparison between the four sea- 
sons, showing mainly the distortions of the upstream-downstream gradient 
across seasons. 
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The differences among the seasons are shown clearly in Figure 7. In Spring, 
sites are lined up on the upstream-downstream gradient and only site 2 
moves slightly to the left. In Summer, pollution is highest at site 2, and 
restoration occurs along sites 3, 4 and 5. In Autumn, pollution is maximum 
because stream flow is at its minimum (pollutant concentrations are max- 
imum). In Winter, pollution has almost disappeared, because Autrans is 
a Summer mountain resort, but the upstream-downstream gradient is still 
disturbed. 

Figure 7B shows the same structures, because the pollution has a negative 
impact on Ephemeroptera species abundance (horizontal axis) and because 
of the upstream-downstream preferences of particular species (vertical axis). 

3.3. COSTATIS. COSTATIS is a new method that is also based on Ac- 
table methods and on co-inertia, but it benefits from the advantages of both 
STATICO and BGCOIA. Indeed, it has the same optimality properties of 
k-table analyses as STATICO (i.e., the maximizing properties of the com- 
promise), but it has the simplicity of BGCOIA. 

COSTATIS is simply a co-inertia analysis of the compromises of two k- 
table analyses (see Figure 3). The first step of COSTATIS consists in per- 
forming two Partial Triadic Analyses: one on the environmental variables 
fc-table, and one on the species /c-table. The second step is simply a co- 
inertia analysis of the compromises of these two Partial Triadic Analyses. 
This means that the number of tables does not have to be the same for 
the two series of tables, but that the number of species, of environmental 
variables, and of sampling sites must be the same for all the tables. 

X c = afcXfc is the (n x p) compromise of the Partial Triadic Analysis 
of environmental variables, and Y c = /3fcY^ is the {n x q) compromise of 
the Partial Triadic Analysis of species data. These compromises are weighted 
means of the tables of the original sequences, with weights equal to the 
components of the first eigenvector of the interstructure of the two Partial 
Triadic Analyses. The inertias of the triplets (X c ,D p ,D n ) and (Y c ,D g ,D n ) 
are maximum under the constraints a\ = l and = 1. 

The Co-Inertia Analysis of these two compromises decomposes the total 
co- inertia: 

CoI XcYc = trace(X c D p X^D n Y c D 9 Y c T D n ), 

and maximizes the co-inertia between species and environmental variable 
scores. An additional step can be implemented, like in the STATICO method: 
it is possible to project the rows and columns of all the tables of the two 
sequences as supplementary elements into the multidimensional space of this 
Co-Inertia Analysis. 

Each compromise represents the "stable structure" of the corresponding 
sequence: X c is the stable structure of the environmental tables sequence, 
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and Y c is the stable structure of the species tables sequence. COSTATIS 
brings to light the relationships between these two stable structures, and it 
discards the conflicting variations between the whole sequences. It is there- 
fore very easy to interpret (like a standard Co- Inertia Analysis), yet it re- 
tains the optimality properties of the compromises of the two Partial Triadic 
Analyses. 

COSTATIS results are presented in Figure 8. COSTATIS is a co-inertia 
analysis, and it is therefore possible to use a permutation test to assess the 
statistical significance of the relationships between the two tables, just like 
in a usual Co-Inertia Analysis. The result of this permutation test on the 
Meaudret data set gave a p- value of 1%. 

The co-inertia analysis is done on the compromises of two /c-table analyses. 
Here, we used two Partial Triadic Analyses, but the results of these two 
analyses are not presented. We show only the plots of the co-inertia analysis, 
under the form of two biplots: one for environmental variables (Figure 8A), 
and one for Ephemeroptera species (Figure 8B). These two biplots could, in 
fact, be superimposed on the same figure, but the outcome was cluttered. 

Presenting the results in this way underlines the fact that COSTATIS 
is looking for the relationships (co-structure) between the stable structures 
extracted from two series of tables. Figure 8A shows the results for the envi- 
ronmental variables. The same structure as the one detected by STATICO 
and BGCOIA is observed. Axis 1 is the pollution gradient (pollution on 
the left) and axis 2 is the upstream-downstream opposition (downstream is 
upward) . 

The four dates for each site are projected on this plot and, like in the BG- 
COIA plot (Figure 4C), the four points corresponding to the four sampling 
dates of each site are grouped to form a star. The gravity center of these 
four points is labeled with the number of the site. The four points of site 2 
are on the left, as pollution is higher in this site for the four dates (except 
for site 3 in Winter). Pollution decreases downstream along sites 3, 4 and 5, 
and is the lowest at site 6. 

The second biplot is presented in Figure 8B. It shows the Ephemeroptera 
species, with the same opposition between upstream and downstream char- 
acteristic species. In the same way as in Figure 8A, the four dates for each 
site are projected on the plot and the corresponding four points are grouped 
to form a star. The gravity center of these four points is labeled with the 
number of the site. The position of sites corresponds to the abundance of the 
species in these sites: sites 2 and 3 have the lowest number of Ephemeroptera, 
so they are far on the left. Site 1 has the highest number of species "Eda," 
and sites 5 and 6 have the highest number of species "Bsp," "Brh" and "Eig." 

The first axis common to these two biplots (i.e., the first COSTATIS 
axis) maximizes the covariance between the coordinates of the "compromise 
variables" and the "compromise species." The result is that it displays the 
relationships between the stable structures extracted from two data sets. 
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Fig. 8. Results of the COSTATIS method (first two axes maps). The eigenvalues cor- 
responding to the two axes are equal to 34-52 and 6.695. The scale is given by the value 
(d) in the upper right corner of each plot; it corresponds to the size of the background 
grid. (A) First biplot of the co-inertia analysis between the compromises of the two Partial 
Triadic Analyses, with sites and variables superimposed. (B) Second biplot, with sites and 
species superimposed. These two biplots could be superimposed. 
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On this example, this relationship is the fact that the pollution gradient 
affects the abundance of Ephemeroptera species. The second axis represents 
the upstream-downstream opposition, and the relationships between ecolog- 
ical preferences of Ephemeroptera species and physical variables or stream 
morphology. 

4. Discussion. The three methods presented here uncover the same fea- 
tures in the example data set. This is a small data set, but with strong 
structure, and strong structures often are clear with any method. However, 
the three methods used to analyze even a data set with clear structure can 
have advantages and drawbacks. The advantages of these methods can be 
summarized as follows: 

BGCOIA: It is the most straightforward method. It is simple to apply and 
outputs are easy to interpret. It can be used to favor one point of view 
(for example, space vs. time), by choosing the factor of Between-Group 
Analysis. It can also be used in conjunction with WGCOIA to study an 
effect (time) after removing the other (space). 

STATICO: The main advantage of this method is the optimality of the 
compromise (maximization of the similarity with all the initial tables). 
It gives a compromise of co-structures, which means that it displays the 
stable component of species-environment relationship variations. It ben- 
efits from the three-steps computation scheme of STATIS-like methods 
(interstructure, compromise, intrastructure) , and graphical outputs can 
be very detailed. 

COSTATIS: This method benefits from the advantages of the two others: op- 
timality of the Partial Triadic Analysis compromises, ease of use, simplic- 
ity of co-inertia analysis graphical outputs. COSTATIS is the co-inertia 
analysis of two compromises, so it looks for the relationships between two 
stable structures. This is different from the STATICO point of view (co- 
structure of two compromises vs. compromise of a series of co-structures). 

The three methods can also be compared from the perspective of the 
possible objectives of a data cube coupling strategy. The first objective is 
to find a "consensus" in the relationships between species and environment. 
This consensus should be independent from the repetitions (time or space), 
and the three methods achieve this in different ways. 

In COSTATIS, a consensus is extracted first, separately and independent- 
ly for environmental variables and species. The relationships between these 
two summaries are then investigated by a co-inertia analysis. In STATICO, 
species-environment relationships are first analyzed at each date, and a stab- 
le summary of these relationships is then computed. 

If species-environment relationships are weak, or present only at some 
dates, they may disappear after the first step of COSTATIS (the two sepa- 
rate Partial Triadic Analyses) and the final co-inertia analysis permutation 
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test may be nonsignificant. Conversely, if species-environment relationships 
are very strong, chronological structures may disappear in STATICO. CO- 
STATIS should therefore be preferred when species-environment relation- 
ships are strong and chronological structures are not of primary importance. 

Another objective of data cube coupling strategy, complementary to the 
first one, can be the search for a description of the evolution of species- 
environment relationships (like seasonal variations or long term changes), 
rather than a description of the stable part of these relationships. In this 
case, STATICO may be more appropriate than COSTATIS, as it computes 
a consensus of species-environment relationships at each date, and only after 
builds a time consensus. 

BGCOIA is slightly different, because it makes easy a choice in the initial 
analysis between a spatial or a chronological setup. It should be used only 
when there are good reasons to give the priority to space or to time. But on 
the other hand, WGCOIA can be used after the BGCOIA, to remove the 
primary effect (space or a time) chosen in BGCOIA. 

Great care must be taken in the choice of the factor defining the groups 
for the BGCOIA method. Franquet, Doledec and Chessel (1995) explain 
why they chose a between-site (as opposed to between-season) co-inertia 
analysis on the example data set. In Hydrobiology, seasonal variations are 
mostly linked to water temperature, and the corresponding between-season 
structures are trivial (Summer- Winter opposition). But in other situations, 
a between-date analysis could be an interesting strategy. This choice of the 
grouping factor, combined with the possibility to use WGCOIA after a BG- 
COIA gives four different analyses (between-site, between-date, within-sites 
and within-date) that can be used to explore complex data sets. 

For fc-table methods, the way of organizing the /c-table in a series of k 
tables is also important. For a three-way array (sites x variables x dates), 
there are three ways to cut the data cube into a series of tables. However, 
only two are really interesting. Indeed, the option "one table = one vari- 
able" is not coherent with the aim of the analysis: a compromise between 
physico-chemical variables would have no meaning. So we have to choose 
between "one table = one date" (as done in the present paper) or "one ta- 
ble = one site." This choice is dictated by the objectives of the study and 
also by the fact that the method will try to compute a compromise as a 
linear combination of the tables. This means that this compromise should 
be meaningful. 

A third point of comparison between data cube coupling methods is the 
numerical constraints put on the parameters of the fc-table, that is, the 
number of species, of environmental variables, of sampling sites, and of dates. 
From this point of view, BGCOIA, STATICO and COSTATIS share the 
same constraints on the species and environmental variables, which should 
always be identical: same species and same environmental variables for all 
dates. 
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But the constraints are different for dates and sampling sites. In CO- 
STATIS, the two series of tables can have different numbers of dates (and 
even different dates), while the sampling sites must be the same for all the 
tables at all dates. In STATICO, the two series of tables must have the same 
dates, but sites can differ among dates (although they must be equal for the 
two tables of a pair) . Constraints from the experimental design can therefore 
influence the choice of the method. 

Moreover, the constraints on species and environmental variables come 
from the choice of the fc-table analysis in COSTATIS and STATICO (a Par- 
tial Triadic Analysis). Extensions of these methods can be imagined, that 
would use another variant of STATIS-like analyses instead of Partial Tri- 
adic Analysis. For example, using STATIS on operators (the classical ACT 
method) [Lavit et al. (1994)] would lead to a "site COSTATIS" method al- 
lowing the use of both different species and different environmental variables. 
We could also define a "species STATICO" allowing the use of different en- 
vironmental variables among dates, and a "variable STATICO" allowing the 
use of different species among dates. 

This possibility of having varying species, environmental variables, dates, 
and sampling sites makes the use of the three methods much more flexible, 
but only COSTATIS allows the use of different species and different envi- 
ronmental variables. However, it should be used with care, as this flexibility 
might be obtained at the expense of loosing some of the structures in the 
data set. 

A drawback common to all these methods is the relative complexity of ex- 
ploratory multivariate data analysis. In this area, the "ade4" package for the 
R environment [R Development Core Team (2009)] tries to make things eas- 
ier. Simple function syntax and structured objects have been privileged. In 
addition, a graphical user interface is available in the "ade4TkGUI" package 
[http://pbil.univ-lyonl.fr/ade4TkGUI/, Thioulouse and Dray (2007)], 
and k-table methods will be implemented in this interface. Moreover, all 
the computations and graphical displays in this article can be redone inter- 
actively online, thanks to this reproducibility page: http://pbil.univ-lyonl. 
fr/SAOASOPET/. 

The availability of methods able to analyze data sets with a complex or- 
ganization, like pairs of data cubes, is important because it allows to take 
into account this organization and to analyze the data sets globally. There 
are alternatives to these methods, like analyzing stacked tables, or perform- 
ing several separate analyses, like time series analysis for each variable, or 
functional data analysis on each table. But the analysis is facilitated by 
taking into account the data structure as dictated by the experimental de- 
sign. Exploring species-environment relationships is not an easy task, and 
adding spatial and temporal influences makes it even more difficult, but this 
is a necessary step toward the comprehension of ecosystem functioning. 
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